integrate hashlogger#3647
Conversation
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3647 +/- ##
==========================================
- Coverage 58.97% 58.64% -0.34%
==========================================
Files 2263 2226 -37
Lines 187223 184451 -2772
==========================================
- Hits 110421 108171 -2250
+ Misses 66858 66508 -350
+ Partials 9944 9772 -172
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
| done := make(chan struct{}) | ||
| select { | ||
| case h.controlChan <- controlMessage{kind: ctrlColumnChange, hashType: hashType, add: add, done: done}: | ||
| select { |
There was a problem hiding this comment.
sendColumnChange returns nil when the context is cancelled or the sender context is done:
Consider returning a sentinel error or at least h.ctx.Err() / h.senderCtx.Err() so the caller knows the registration didn't actually land.
There was a problem hiding this comment.
function now returns an error if context is cancelled
| } | ||
|
|
||
| // containsString reports whether s is present in xs. | ||
| func containsString(xs []string, s string) bool { |
There was a problem hiding this comment.
Nit: The containsString function can be replaced with slices.Contains
There was a problem hiding this comment.
Converted to set membership, no longer required
| } | ||
| } | ||
| } | ||
| for _, category := range desired { |
There was a problem hiding this comment.
A set-based approach would be clearer?
desiredSet := make(map[string]struct{}, len(desired))
for _, c := range desired { desiredSet[c] = struct{}{} }
| // stores as the next block's header.LastResultsHash; logging it per block surfaces gas/result | ||
| // divergence (e.g. between executors) independently of the state AppHash. Only computed when the | ||
| // commit store records hashes, and never fails the block on a marshal error. | ||
| if _, ok := app.cms.(interface{ SetNextResultHash([]byte) }); ok { |
There was a problem hiding this comment.
even sc-hash-logger-enable = false, interface{ SetNextResultHash([]byte) } is always true?
PR SummaryMedium Risk Overview Configuration flows through new baseapp passes the block header hash and a merkle result hash (from finalized tx results, only when rootmulti owns orchestration: lazy logger open, dynamic column sync as backends change during migration, changeset capture on the first flush per block, memIAVL, flatKV, and composite expose Reviewed by Cursor Bugbot for commit 1153f77. Bugbot is set up for automated code reviews on this repo. Configure here. |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 92c54f3. Configure here.
| } | ||
| } | ||
| } | ||
| rs.hashCategories = desired |
There was a problem hiding this comment.
Failed hash category registration skipped
Medium Severity
In syncHashCategories, when RegisterHashType fails, the error is only logged but rs.hashCategories is still replaced with the full desired set. Later commits treat that category as already synced and never call RegisterHashType again, so ReportHash keeps failing with an unknown type and that column stays missing from the hash log until restart.
Reviewed by Cursor Bugbot for commit 92c54f3. Configure here.
There was a problem hiding this comment.
This PR wires the per-block hash logger into rootmulti's commit path (app hash, block hash, result hash, memIAVL/flatKV per-store hashes, changeset), adds runtime column register/unregister with file rotation, and makes retention limits 0-disable-able. The implementation is well-tested and the integration points degrade safely; no blocking correctness issues were found, only a config-consistency note and an ops/performance consideration about the default-on feature sitting in the consensus commit path.
Findings: 0 blocking | 4 non-blocking | 0 posted inline
Blockers
- None at the file/PR level.
Non-blocking
- (Codex)
sei-cosmos/server/config/config.goGetConfig()builds a freshStateCommitConfigliteral withoutHashLogger, soconfig.GetConfig(...).StateCommit.HashLoggeris zero-valued (Enable=false, all retention=0). This file is not in the PR diff so it can't be anchored inline. Verified it is currently harmless: the live hash-logger store is built fromapp.parseSCConfigs(app/seidb.go:80,102), and app.toml is rendered fromcustomAppConfig/serverconfig.DefaultConfig()(cmd/seid/cmd/root.go:384,415), not fromGetConfig. Still worth seedingStateCommitfromconfig.DefaultStateCommitConfig()(or parsing thesc-hash-logger-*keys) inGetConfigto avoid a latent footgun: if any tooling ever renders app.toml from aGetConfig-derived Config, it would emitsc-hash-logger-enable = falseand on the next load silently disable hash logging. - Performance/ops: hash logging is enabled by default and
recordBlockHashesruns insiderootmulti.Store.Commitwhile holdingrs.mtx(the consensus commit hot path).ReportHash/ReportChangeset/RegisterHashTypedo blocking sends on the logger's buffered control/writer channels, so sustained writer backpressure (slow or near-full disk) could add latency to block commits. It is not a deadlock risk (a fatal logger error cancels senderCtx so sends abort, and the store then disables logging), but the team should confirm the default buffer sizes absorb expected bursts before shipping this on by default. cursor-review.mdis empty — the Cursor second-opinion pass produced no output.REVIEW_GUIDELINES.mdis also empty, so no repo-specific review standards were applied.- Minor: in
rootmulti.flush, the capturedrs.blockChangeSetsaliases the samechangeSetsslice that is also handed tossStore.ApplyChangesetAsync, and it is later consumed asynchronously by the hash logger's hasher goroutine viaReportChangeset. This is safe only if both async consumers treat the changeset as read-only; worth a confirming comment since both run concurrently.
There was a problem hiding this comment.
Wires a per-block hash logger into the commit path (block/result/app/memIAVL/flatKV hashes) with dynamic column registration and retention; the implementation is well-tested and disabled in the test harness. One real but edge-case correctness issue at column-set changes (flagged by Codex) and a couple of default-on hot-path considerations are worth noting, none blocking for a debug/forensics tool.
Findings: 0 blocking | 5 non-blocking | 1 posted inline
Blockers
- None at the file/PR level.
Non-blocking
- Column-change vs. in-flight changeset race (agrees with Codex): rootmulti's
syncHashCategories()runs at the start of each block'srecordBlockHashes, before that block'sReportChangeset. If the category set changes (module add/remove at an upgrade, or memIAVL→flatKV at migration) while the previous block's changeset hash is still being computed asynchronously by the hasher,handleColumnChangeonlydrainComplete()s blocks that are already complete. The prior block is incomplete (its changeset hasn't returned), so the column set is mutated underneath it. Completeness is then judged by count against the newhashTypes: if a column was net-added the prior block can never complete (dropped on clean shutdown with an Info log, or force-flushed partial under buffer pressure); if a column was net-removed it can be emitted with the changeset still missing. This is confined to rare migration/upgrade boundaries and to a non-consensus debugging tool, so it is not blocking, but it undercuts the tool's forensic accuracy at exactly the boundaries operators care about. Consider draining in-flight changesets (or otherwise quiescing pending blocks) before applying a column change, or keying completeness on the per-block expected column set rather than a live count. - Hash logging is enabled by default and now sits inline in the consensus-critical
Commitpath (recordBlockHashesis called underrs.mtx).ReportHash/ReportChangesetdo blocking sends to the control loop, which backpressures through the buffered writer channel. Buffers absorb transient stalls, but a sustained slow/near-full disk could backpressure block commit. Worth confirming this risk is acceptable for a default-on feature, or documenting the operator escape hatch (sc-hash-logger-enable=false). - With the default
BlocksToRetain=0and an operator-setMaxDiskSize=0, both retention dimensions are disabled and the hash log grows without bound. This is documented as a deliberate choice and the defaultMaxDiskSize(16 GiB) keeps the out-of-box config bounded, so this is just a heads-up. - The Cursor second-opinion review file (cursor-review.md) was empty — that pass produced no output.
- 1 suggestion(s)/nit(s) flagged inline on specific lines.
| } | ||
|
|
||
| // Flush everything that is complete under the current columns to the current file. | ||
| h.drainComplete() |
There was a problem hiding this comment.
[suggestion] handleColumnChange only flushes blocks that are already complete (drainComplete) before mutating hashTypes. A block whose changeset hash is still in flight on the hasher thread is not complete here, so the column set changes underneath it. Since drainComplete/flushCompleteOnShutdown judge completeness by len(Hashes) < len(h.hashTypes) (a count, not a column-name match), once a column is net-added that in-flight block can never reach the new count and is dropped on clean shutdown (or force-flushed partial under buffer pressure); a net-removed column can let it emit with the changeset still missing. In this PR's caller (rootmulti.syncHashCategories runs before the block's ReportChangeset), this can trigger at module add/remove or memIAVL→flatKV migration boundaries while the prior block's changeset is pending. Consider draining in-flight changesets before applying the column change, or tracking each pending block's expected column set rather than relying on a live count.
* main: feat(seid): ConfigManager selection seam (PLT-775 PR1) (#3671) fix(evmrpc): limit listener max open connections, configurable via max_open_connections (PLT-704) (#3637) LittDB: Keymap threading improvements (#3645) integrate hashlogger (#3647) fix(metrics): Prometheus metrics output (#3640) [codex] Harden multiversion iterator validation (#3656) feat(consensus): mock_chain_validation replay build + memIAVL state-sync restore fixes (#3663) chore: replace OLD red SeiLogo banner in README with new 2026 Sei lockup (#3670) Require absolute path for evmone lib (#3668) fix(evmrpc): apply getLogs maxLog cap during merge instead of after (PLT-687) (#3666) feat(evmrpc): pre-decode request size admission control (PLT-295) (#3648) Make autobahn block production check wait for progress (#3667) fix(sei-tendermint): prevent readRoutine goroutine leak on /websocket when writeChan is full (PLT-707) (#3664) Per-block littidx flush + single shard (gated on #3645) (#3660) fix(evmrpc): bound debug_traceStateAccess memory and add trace admission control (PLT-360) (#3653) [codex] bump go-ethereum to v1.15.7-sei-17 (#3657) Upodate checkout GHA step across all workflows (#3659) Add GoReleaser release pipeline for static seid binaries (#3425) Parallelize littidx eth_getLogs across blocks (#3652)


Describe your changes and provide context
Wire in the hashlogger to report hashes for lots of different things